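Explainable AI (XAI) is a promising means of supporting human-AI collaboration in high-stakes visual detection tasks, such as damage detection from satellite imagery, because fully automated approaches are unlikely to be perfectly safe and reliable. However, most existing XAI techniques are not informed by an understanding of the task-specific explanation needs of humans. We therefore take a first step toward understanding what forms of XAI humans require in damage detection tasks. We conducted an online crowdsourced study to learn how people explain their own assessments when evaluating the severity of building damage from satellite imagery. Through a study with 60 crowdworkers, we surfaced six major strategies that humans use to explain their visual damage assessments. We present the implications of our findings for the design of XAI methods in such visual detection contexts and discuss opportunities for future research.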
translated by 谷歌翻译
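Although India has been one of the hotspots of COVID-19, data about the pandemic from the country has proved largely inaccessible at scale. Much of the data exists in unstructured form on the web, and only limited aspects of it are available through public APIs that volunteers maintain by hand. This has made it difficult both to access detailed data and to sustain manual data-keeping over time. This paper reports on our effort to automate the extraction of such data from public health bulletins using a combination of classical PDF parsers and state-of-the-art machine learning techniques. We describe the automated data-extraction technique, the nature of the generated data, and promising avenues of ongoing work.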
As language models have grown in parameters and layers, it has become much harder to train and infer with them on single GPUs. This is severely restricting the availability of large language models such as GPT-3, BERT-Large, and many others. A common technique to solve this problem is pruning the network architecture by removing transformer heads, fully-connected weights, and other modules. The main challenge is to discern the important parameters from the less important ones. Our goal is to find strong metrics for identifying such parameters. We thus propose two strategies: Cam-Cut based on the GradCAM interpretations, and Smooth-Cut based on the SmoothGrad, for calculating the importance scores. Through this work, we show that our scoring functions are able to assign more relevant task-based scores to the network parameters, and thus both our pruning approaches significantly outperform the standard weight and gradient-based strategies, especially at higher compression ratios in BERT-based models. We also analyze our pruning masks and find them to be significantly different from the ones obtained using standard metrics.
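The baseline the abstract compares against, pruning by weight magnitude or by a gradient-based score, can be sketched as follows. This is a minimal illustration under stated assumptions, not the Cam-Cut or Smooth-Cut scoring: the function names, the `compression` parameter, and the toy weights and gradients are all hypothetical, and a real model would score tensors rather than flat lists.

```python
from typing import List


def magnitude_scores(weights: List[float]) -> List[float]:
    """Baseline importance: absolute weight value."""
    return [abs(w) for w in weights]


def gradient_scores(weights: List[float], grads: List[float]) -> List[float]:
    """Baseline importance: |weight * gradient| (a first-order saliency)."""
    return [abs(w * g) for w, g in zip(weights, grads)]


def pruning_mask(scores: List[float], compression: float) -> List[int]:
    """Return a 0/1 mask that drops the lowest-scoring fraction `compression`."""
    if not 0.0 <= compression < 1.0:
        raise ValueError("compression must be in [0, 1)")
    n_prune = int(len(scores) * compression)
    # Indices ordered from least to most important.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    mask = [1] * len(scores)
    for i in order[:n_prune]:
        mask[i] = 0
    return mask


# Hypothetical toy parameters and gradients, for illustration only.
weights = [0.9, -0.05, 0.4, -0.7, 0.01, 0.3]
grads = [0.01, 2.0, 0.5, 0.02, 0.1, 1.0]

mask_w = pruning_mask(magnitude_scores(weights), compression=0.5)
mask_g = pruning_mask(gradient_scores(weights, grads), compression=0.5)
print(mask_w)  # [1, 0, 1, 1, 0, 0]
print(mask_g)  # [0, 1, 1, 0, 0, 1]
```

The two masks disagree on which parameters to keep, which is the kind of difference between scoring functions that the abstract reports analyzing.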
State-of-the-art automatic augmentation methods (e.g., AutoAugment and RandAugment) for visual recognition tasks diversify training data using a large set of augmentation operations. The range of magnitudes of many augmentation operations (e.g., brightness and contrast) is continuous. Therefore, to make search computationally tractable, these methods use fixed and manually-defined magnitude ranges for each operation, which may lead to sub-optimal policies. To answer the open question on the importance of magnitude ranges for each augmentation operation, we introduce RangeAugment, which allows us to efficiently learn the range of magnitudes for individual as well as composite augmentation operations. RangeAugment uses an auxiliary loss based on image similarity as a measure to control the range of magnitudes of augmentation operations. As a result, RangeAugment has a single scalar parameter for search, image similarity, which we simply optimize via linear search. RangeAugment integrates seamlessly with any model and learns model- and task-specific augmentation policies. With extensive experiments on the ImageNet dataset across different networks, we show that RangeAugment achieves competitive performance to state-of-the-art automatic augmentation methods with 4-5 times fewer augmentation operations. Experimental results on semantic segmentation, object detection, foundation models, and knowledge distillation further show RangeAugment's effectiveness.
In this work, we explore a useful but often neglected methodology for robustness analysis of text generation evaluation metrics: stress tests with synthetic data. Basically, we design and synthesize a wide range of potential errors and check whether they result in a commensurate drop in the metric scores. We examine a range of recently proposed evaluation metrics based on pretrained language models, for the tasks of open-ended generation, translation, and summarization. Our experiments reveal interesting insensitivities, biases, or even loopholes in existing metrics. For example, we find that BERTScore ignores truncation errors in summarization, and MAUVE (built on top of GPT-2) is insensitive to errors at the beginning of generations. Further, we investigate the reasons behind these blind spots and suggest practical workarounds for a more reliable evaluation of text generation.
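The stress-test recipe described above (inject a synthetic error, then check whether the score drops) can be sketched with toy metrics. The two metrics below are simple unigram-overlap stand-ins chosen for illustration; they are not BERTScore or MAUVE, and the helper names (`truncate`, `score_drop`) and the example sentence are hypothetical.

```python
from typing import Callable


def unigram_precision(hypothesis: str, reference: str) -> float:
    """Toy metric: fraction of hypothesis tokens that occur in the reference."""
    hyp = hypothesis.split()
    ref = set(reference.split())
    if not hyp:
        return 0.0
    return sum(tok in ref for tok in hyp) / len(hyp)


def unigram_f1(hypothesis: str, reference: str) -> float:
    """Toy metric: F1 over the sets of unique tokens."""
    hyp, ref = set(hypothesis.split()), set(reference.split())
    overlap = len(hyp & ref)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(hyp), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def truncate(text: str, keep_ratio: float = 0.5) -> str:
    """Synthetic error: keep only the first part of the text."""
    tokens = text.split()
    keep = max(1, int(len(tokens) * keep_ratio))
    return " ".join(tokens[:keep])


def score_drop(metric: Callable[[str, str], float], hypothesis: str,
               reference: str, perturb: Callable[[str], str]) -> float:
    """Score of the clean text minus score of the perturbed text."""
    return metric(hypothesis, reference) - metric(perturb(hypothesis), reference)


reference = "the storm damaged several buildings near the coast"
hypothesis = reference  # a perfect output before any error is injected

precision_drop = score_drop(unigram_precision, hypothesis, reference, truncate)
f1_drop = score_drop(unigram_f1, hypothesis, reference, truncate)
print(precision_drop, f1_drop)
```

In this toy setting the precision-only metric does not drop at all under truncation, while the F1 metric does, which illustrates the kind of blind spot a stress test is meant to expose.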
Multi-lingual language models (LMs), such as mBERT, XLM-R, mT5, and mBART, have been remarkably successful in enabling natural language tasks in low-resource languages through cross-lingual transfer from high-resource ones. In this work, we try to better understand how such models, specifically mT5, transfer *any* linguistic and semantic knowledge across languages, even though no explicit cross-lingual signals are provided during pre-training. Rather, only unannotated texts from each language are presented to the model separately and independently of one another, and the model appears to implicitly learn cross-lingual connections. This raises several questions that motivate our study, such as: Are the cross-lingual connections between every language pair equally strong? What properties of source and target language impact the strength of cross-lingual transfer? Can we quantify the impact of those properties on the cross-lingual transfer? In our investigation, we analyze a pre-trained mT5 to discover the attributes of cross-lingual connections learned by the model. Through a statistical interpretation framework over 90 language pairs across three tasks, we show that transfer performance can be modeled by a few linguistic and data-derived features. These observations enable us to interpret cross-lingual understanding of the mT5 model. Through these observations, one can favorably choose the best source language for a task, and can anticipate its training data demands. A key finding of this work is that similarity of syntax, morphology and phonology are good predictors of cross-lingual transfer, significantly more than just the lexical similarity of languages. For a given language, we are able to predict zero-shot performance, which increases on a logarithmic scale with the number of few-shot target language data points.
Finetuning image-text models such as CLIP achieves state-of-the-art accuracies on a variety of benchmarks. However, recent works like WiseFT (Wortsman et al., 2021) and LP-FT (Kumar et al., 2022) have shown that even subtle differences in the finetuning process can lead to surprisingly large differences in the final performance, both for in-distribution (ID) and out-of-distribution (OOD) data. In this work, we show that a natural and simple approach of mimicking contrastive pretraining consistently outperforms alternative finetuning approaches. Specifically, we cast downstream class labels as text prompts and continue optimizing the contrastive loss between image embeddings and class-descriptive prompt embeddings (contrastive finetuning). Our method consistently outperforms baselines across 7 distribution shifts, 6 transfer learning, and 3 few-shot learning benchmarks. On WILDS-iWILDCam, our proposed approach FLYP outperforms the top of the leaderboard by $2.3\%$ ID and $2.7\%$ OOD, giving the highest reported accuracy. Averaged across 7 OOD datasets (2 WILDS and 5 ImageNet associated shifts), FLYP gives gains of $4.2\%$ OOD over standard finetuning and outperforms the current state of the art (LP-FT) by more than $1\%$ both ID and OOD. Similarly, on 3 few-shot learning benchmarks, our approach gives gains up to $4.6\%$ over standard finetuning and $4.4\%$ over the state of the art. In total, these benchmarks establish contrastive finetuning as a simple, intuitive, and state-of-the-art approach for supervised finetuning of image-text models like CLIP. Code is available at https://github.com/locuslab/FLYP.
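The contrastive finetuning objective described above can be sketched as a symmetric cross-entropy over image-prompt similarities. This is a minimal pure-Python illustration under stated assumptions, not the FLYP code from the linked repository: the prompt template, the `temperature` value, and the two-dimensional embeddings are hypothetical, and the toy batch assumes one image per class.

```python
import math
from typing import List

Vector = List[float]


def normalize(v: Vector) -> Vector:
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cross_entropy(logits: Vector, target: int) -> float:
    """Numerically stable -log softmax(logits)[target]."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[target]


def contrastive_loss(image_emb: List[Vector], text_emb: List[Vector],
                     temperature: float = 0.07) -> float:
    """Symmetric contrastive loss; image i is paired with prompt i."""
    imgs = [normalize(v) for v in image_emb]
    txts = [normalize(v) for v in text_emb]
    n = len(imgs)
    # logits[i][j]: scaled cosine similarity of image i and prompt j.
    logits = [[sum(a * b for a, b in zip(imgs[i], txts[j])) / temperature
               for j in range(n)] for i in range(n)]
    image_to_text = sum(cross_entropy(logits[i], i) for i in range(n)) / n
    text_to_image = sum(
        cross_entropy([logits[i][j] for i in range(n)], j) for j in range(n)
    ) / n
    return 0.5 * (image_to_text + text_to_image)


# Class labels cast as text prompts (hypothetical template).
labels = ["cat", "dog"]
prompts = [f"a photo of a {label}" for label in labels]

# Hypothetical embeddings standing in for encoder outputs.
image_emb = [[1.0, 0.0], [0.0, 1.0]]
prompt_emb_aligned = [[1.0, 0.0], [0.0, 1.0]]
prompt_emb_swapped = [[0.0, 1.0], [1.0, 0.0]]

aligned = contrastive_loss(image_emb, prompt_emb_aligned)
misaligned = contrastive_loss(image_emb, prompt_emb_swapped)
print(aligned < misaligned)  # True
```

In an actual finetuning run the embeddings would come from the image and text encoders, and the loss would be minimized with respect to their parameters.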
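In the modern world, the application of data science and analytics to optimize or predict outcomes is ubiquitous. Data science and analytics have optimized almost every domain that exists in the market. In our survey, we focus on how analytics has been adopted in sports and how it has contributed to the transformation of the game, from the assessment and selection of on-field players, to the prediction of the winning team, to ticket marketing and the business aspects of big sports tournaments. We present the analytical tools, algorithms, and methodologies adopted in sports analytics for different sports, give our views on them, and compare and contrast these existing approaches. In doing so, we also present the best tools, algorithms, and analytical methodologies for anyone who wants to experiment with sports data and analyze various aspects of the game.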
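The evolution of digital technologies and the increasing popularity of sports inspired innovators to take the experience of sports-inclined users to a whole new level by introducing fantasy sports platforms (FSPs). The application of data science and analytics is ubiquitous in the modern world. Data science and analytics open doors to a deeper understanding and support the decision-making process. We firmly believed that we could adopt data science to predict the winning fantasy cricket team on the FSP Dream 11. We built a predictive model that predicts the performance of players in a prospective game. We used a combination of greedy and knapsack algorithms to prescribe the combination of 11 players that forms the fantasy cricket team with the greatest statistical odds of finishing as the strongest team, thereby giving us a higher chance of winning the pot of bets on the Dream 11 FSP. We used the PyCaret Python library to help us understand and adopt the best regression algorithm for our problem statement and make precise predictions. Further, we used the Plotly Python library to gain visual insights into team and player performance by accounting for the statistical and subjective factors of a prospective game. The interactive plots help us bolster the recommendations of our predictive model. You either win big, win small, or lose your bet based on the performance of the players selected for your fantasy team in the prospective game, and our model increases the probability of winning big.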
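The knapsack part of the team selection can be sketched as a dynamic program that picks a fixed number of players under a cost budget while maximizing predicted points. This is an illustrative sketch, not the authors' implementation: the per-player cost, the budget, the `pick_team` helper, and the toy player table are assumptions, the greedy component is not shown, and platform rules such as role or per-team limits are not modeled.

```python
from typing import List, Tuple

# (name, cost in credits, predicted points); all values are hypothetical.
Player = Tuple[str, int, float]


def pick_team(players: List[Player], team_size: int,
              budget: int) -> Tuple[float, List[str]]:
    """0/1 knapsack with an exact team size.

    best[k][c] holds the best (points, names) using exactly k players whose
    costs sum to exactly c. Returns (-inf, []) if no team is feasible.
    """
    neg_inf = float("-inf")
    best = [[(neg_inf, []) for _ in range(budget + 1)]
            for _ in range(team_size + 1)]
    best[0][0] = (0.0, [])
    for name, cost, points in players:
        # Go from large k to small k so each player is used at most once.
        for k in range(team_size, 0, -1):
            for c in range(budget, cost - 1, -1):
                prev_points, prev_names = best[k - 1][c - cost]
                if prev_points == neg_inf:
                    continue
                candidate = prev_points + points
                if candidate > best[k][c][0]:
                    best[k][c] = (candidate, prev_names + [name])
    return max(best[team_size], key=lambda entry: entry[0])


players = [
    ("A", 10, 60.0),
    ("B", 9, 50.0),
    ("C", 6, 45.0),
    ("D", 5, 40.0),
    ("E", 4, 20.0),
]

# A team of 3 under a budget of 20 credits, as a small stand-in for 11 players.
points, team = pick_team(players, team_size=3, budget=20)
print(points, team)  # 135.0 ['B', 'C', 'D']
```

In the setting the abstract describes, the predicted points would come from the regression model and the team size would be 11.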
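Accents form an integral part of identifying cultures, emotions, behavior, and more. People often perceive each other differently because of their accents. An accent can convey status, pride, and other emotional information that can be captured from speech itself. Accent can be defined as "the way in which people in a particular area, country, or social group pronounce words" or "a special emphasis given to a syllable in a word, a word in a sentence, or a note in a set of musical notes". Accented speech recognition is one of the most important problems in the field of speech recognition. Speech recognition is an interdisciplinary subfield of computer science and linguistics research whose main aim is to develop technologies that convert speech into text. The speech can take any form, such as read speech, spontaneous speech, or conversational speech. Unlike text, speech has a great deal of diversity. This diversity stems from environmental conditions, speaker-to-speaker variability, channel noise, differences in speech production due to disabilities, and the presence of disfluencies. Speech is therefore a rich source of information waiting to be exploited.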